Facility for Highlighting Documents Accessed Through Search or Browsing



Field of the Invention 

This invention relates generally to the field of computers, and in
particular to enhancing query results provided by a search engine.

Copyright Notice/Permission 

A portion of the disclosure of this patent document contains material
which is subject to copyright protection. The copyright owner has no objection
to the facsimile reproduction by anyone of the patent document or the patent
disclosure as it appears in the Patent and Trademark Office patent file or records,
but otherwise reserves all copyright rights whatsoever. The following notice
applies to the software and data as described below and in the drawing hereto:
Copyright © 2000, Microsoft Corporation, All Rights Reserved.

Background 

The World Wide Web (WWW), often referred to as the Web, is a fast
growing network that involves a vast quantity of data and numerous types of
services aimed at accessing, organizing, and distributing that data. In particular,
there are millions of documents on the Web and many on-line search services
that enable users to find documents that are of interest to them.

Furthermore, documents on the Web are linked via hyperlinks, created by
the authors of the documents, which enable users to browse through
documents on their own by following the links that interest them.

The large quantity of Web data and the fast rate of Web expansion
have inherent implications for the ways services on the Web can approach
the problem of processing Web data.

Collecting and processing all or a majority of Web documents, and
updating the information collected about these documents at an appropriate
rate, is often not feasible. Indeed, processing power and network
bandwidth are not yet up to the task. However, there is also a more fundamental
reason: because of the distributed nature of the data, the services are not in
control of document changes; the authors of Web documents can change
them at any time, as needed. That is why, among other reasons, search engines
do not deliver the document text in response to the user's query. At best, the
search engines deliver the title and some type of summary of a document,
created by the search engine based on the version of the document available at
the time the document was collected and indexed. The search engine points the
user to the URL, i.e., the location of the document on the Web at the time the
document was collected. It is then up to the user to follow the URL link and
access the document text, which may or may not be the same as the text
processed and summarized by the search engine.

This lack of control over the content of documents on the Web requires 
new approaches in providing some of the basic and commonly provided 
document management features of traditional document management systems. 
Such features include: marking of the query terminology in the document text to 
help the user identify the portions of the text that talk about the desired topic, to 
assess the document relevance to the topic, etc.; summarizing document text to 
extract most salient sentences or query specific portions of the text; analyzing 
the text to identify and extract entities that may be of particular interest to the 
user, e.g., person names, company names, locations, etc., or relations among 
these entities; creating various visual representations of the document to help 
with browsing through the document, assessing document relevance, etc. 

Since the documents on the Web are frequently accessed in the browsing
mode by following the hyperlinks in the documents, the same type of document 
management support is needed for browsing among and through Web 
documents. 

Furthermore, since the type and the quality of services on the Web vary,
users on the Web often need to explore which of them can best handle a
particular request for information. For example, if the user is engaging a couple
of search engines to find certain types of documents, this often involves retyping
the query in the appropriate search window of the individual search engines.
There is a need for a facility that can assist the user in specifying the user's
information need and that creates various representations of that need suitable
for interfacing with various Web services.

In summary, there is a need to provide the user with facilities for
obtaining better information regarding the relevancy of documents pointed to by
various services on the Web or accessed by browsing Web documents.
There is a further need to provide such information based on the current versions
of the documents. There is still a further need to provide the user with a
consistent manner in which such relevancy is identified regardless of the way the
document is accessed (based on Web service information, on browsing, or on a
combination of the two). There is yet a further need to provide a rich
representation of the user's information need.



Summary of the Invention 

An information highlighting facility on a computer assists the user in
searching, browsing, and reading documents on the Web or similar distributed
network environments. When the user downloads a document from the Web,
e.g., by following a hyperlink while browsing the Web or by choosing one of the
documents that a search engine (or some other Web service) found relevant to a
previously issued query, the information highlighting facility provides
information to assist the user in determining whether the document is of interest
to the user. The facility matches the document text with a model of the user's
information need that has been created by the facility (independently from the
services that the user is using on the Web) and supports a number of document
analyses.

In the case of search, the document text is analyzed with respect to the
user's specified information need. In this instance, the assistance in assessing the
document relevance may be provided by marking keywords or key phrases
within documents to make them easier to spot, by scrolling to what seems to be
the most relevant portion of the document, etc., or by combinations thereof.
Additional assistance can be provided by extracting specified features from the
document such as company names, person names, location names, etc., by summarizing
documents in view of the user's query, by constructing thumbnail images of
documents with query terms highlighted, etc. Furthermore, the facility can
provide an alternative ranking of documents pointed to by the search engine on the
basis of the richer representation of the user's need that the facility created. That
can be achieved by pre-fetching, analyzing, and re-ranking a selection of
documents that were originally pointed to by the search engine.

In the case of browsing, for example, the user can specify in advance, or
at the time the document is accessed, a perspective from which the user wants
the document to be analyzed. For example, the user can provide the information
highlighting facility with a description of the topic the user is interested in or
other criteria for analyzing documents (e.g., a format specification of the
document). This description of the user's preferences can be applied to analyze
the accessed documents (currently and subsequently) as well as used to give a
relevance assessment of the documents pointed to by the hyperlinks in the
currently viewed document. Relevance assessment of hyperlinks could be
achieved, for example, by downloading and analyzing the linked documents in
the background and providing the user with a qualitative characterization of
the links.

To assist the user in reading and assessing the documents, the
information highlighting facility creates a description or a model of the user's
need or interest. This model is used as the basis for various document analyses.
The model may include, but is not limited to, descriptions of queries that the user is
sending to search engines on the Web, a general 'profile of interest' that the user
specified (e.g., by means of a dialog), the augmented versions of these
descriptions that the highlighting facility created based on further linguistic
and/or semantic analysis, or additional information that the highlighting facility
may collect or infer about the user's current task. The user may also request
some generic types of analysis to be applied, e.g., extraction of certain types of
entity names or entity relations that may be contained in the document. This
model of the user interest serves as a context for the analysis of the accessed or
pre-fetched documents.
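
By way of illustration only, the contents of such a model might be held in a
simple data structure. The following Python sketch is not part of the disclosed
embodiments; the class name InterestModel, its field names, and the sample
values are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class InterestModel:
    """Hypothetical container for the model of the user's need or interest."""

    # Queries the user is sending to search engines on the Web.
    queries: list[str] = field(default_factory=list)
    # General 'profile of interest' specified by the user, e.g. in a dialog.
    profile_of_interest: str = ""
    # Augmented versions of the descriptions: original term -> added variants.
    augmented_terms: dict[str, list[str]] = field(default_factory=dict)
    # Information collected or inferred about the user's current task.
    task_context: str = ""
    # Generic analyses the user requested, e.g. "company_names".
    requested_analyses: list[str] = field(default_factory=list)

    def all_terms(self) -> set[str]:
        """Return the query words plus every augmented variant, lowercased."""
        terms = {word.lower() for query in self.queries for word in query.split()}
        for variants in self.augmented_terms.values():
            terms.update(variant.lower() for variant in variants)
        return terms


model = InterestModel(
    queries=["car insurance"],
    augmented_terms={"car": ["automobile", "vehicle"]},
    requested_analyses=["company_names"],
)
```

In this sketch, all_terms() gives the analyses a single vocabulary to match
against, whichever source each term came from.
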

The processing required for the construction of the model can be done
locally using facilities on the user's computer, or as an external service (e.g., at a
dedicated server on the network), or as a combination of the two. Furthermore,
the model construction can be done simultaneously with and independently from the
other services that the user is using on the Web (e.g., search engines).

The information highlighting facility applies the model to the documents
that are accessed by the user or, if required for some types of analysis, to
documents pre-fetched in the background. The results of the various analyses
are presented appropriately (by inserting markup in the document, extracting
information into separate windows, or creating various other visual
representations).

The facility also provides support for managing various user interest
models and for applying them as needed, both for document analysis and
for interfacing with other Web services (e.g., the user can simply point to the
portion of the model representation that needs to be sent to a particular search
service as a query).

The principles on which the information highlighting facility is built
allow for incorporation of various types of document analysis. For example, it
can include but is not limited to: terminology marking, scrolling, re-ranking,
document thumbnailing, summarization and link analysis.



Brief Description of the Drawings 

Figure 1 is a block diagram of a computer system on which the present
         invention may be implemented.
Figure 2A is a block flow diagram showing interaction of the present
         invention with a Web based information service (e.g., a search
         engine) and browser.
Figure 2B is a block flow diagram of a service for creating a model of the
         user's interest and management of documents and document
         requests.
Figure 3 is a flow diagram showing the flow of creation of a context and
         its application to documents to provide highlighting.
Figure 4 is a block diagram showing components involved in providing
         augmented search terms and highlighting.
Figure 5 is a flow diagram showing scrolling of a document to its most
         relevant portion.
Figure 6 is a flow diagram showing re-ranking of documents provided by a
         search engine.
Figure 7 is a flow diagram showing the identification and provision of a
         list of names associated with a document.
Figure 8 is a flow diagram showing the creation of a thumbnail of a
         document with highlighting.
Figure 9 is a flow diagram showing the creation of a summary of a
         document.



Detailed Description 



In the following detailed description of exemplary embodiments of the 
invention, reference is made to the accompanying drawings which form a part 
hereof, and in which is shown by way of illustration specific exemplary 
embodiments in which the invention may be practiced. These embodiments are 
described in sufficient detail to enable those skilled in the art to practice the 
invention, and it is to be understood that other embodiments may be utilized and 
that logical, mechanical, electrical and other changes may be made without 
departing from the spirit or scope of the present invention. The following
detailed description is, therefore, not to be taken in a limiting sense, and the
scope of the present invention is defined only by the appended claims. 

The detailed description is divided into multiple sections. A first section 
describes the operation of a computer system which implements the current 
invention. This is followed by a high level description of the invention, 
including how the model of the user's interest is generated and used. Further 
embodiments are then described, including re-ranking of documents and 
extracting and generating information from the documents to further assist the
user in reading and assessing the accessed documents. 

Hardware and Operating Environment

Figure 1 provides a brief, general description of a suitable computing
environment in which the invention may be implemented. The invention will
hereinafter be described in the general context of computer-executable program
modules containing instructions executed by a personal computer (PC). Program
modules include routines, programs, objects, components, data structures, etc.
that perform particular tasks or implement particular abstract data types. Those
skilled in the art will appreciate that the invention may be practiced with other
computer-system configurations, including hand-held devices, multiprocessor
systems, microprocessor-based programmable consumer electronics, network
PCs, minicomputers, mainframe computers, and the like which have multimedia
capabilities. The invention may also be practiced in distributed computing
environments where tasks are performed by remote processing devices linked
through a communications network. In a distributed computing environment,
program modules may be located in both local and remote memory storage
devices.

Figure 1 shows a general-purpose computing device in the form of a
conventional personal computer 20, which includes processing unit 21, system
memory 22, and system bus 23 that couples the system memory and other
system components to processing unit 21. System bus 23 may be any of several
types, including a memory bus or memory controller, a peripheral bus, and a
local bus, and may use any of a variety of bus structures. System memory 22
includes read-only memory (ROM) 24 and random-access memory (RAM) 25.
A basic input/output system (BIOS) 26, stored in ROM 24, contains the basic
routines that transfer information between components of personal computer 20.
BIOS 26 also contains start-up routines for the system. Personal computer 20
further includes hard disk drive 27 for reading from and writing to a hard disk
(not shown), magnetic disk drive 28 for reading from and writing to a removable
magnetic disk 29, and optical disk drive 30 for reading from and writing to a
removable optical disk 31 such as a CD-ROM or other optical medium. Hard
disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to
system bus 23 by a hard-disk drive interface 32, a magnetic-disk drive interface
33, and an optical-drive interface 34, respectively. The drives and their
associated computer-readable media provide nonvolatile storage of
computer-readable instructions, data structures, program modules and other data for
personal computer 20. Although the exemplary environment described herein
employs a hard disk, a removable magnetic disk 29 and a removable optical disk
31, those skilled in the art will appreciate that other types of computer-readable
media which can store data accessible by a computer may also be used in the
exemplary operating environment. Such media may include magnetic cassettes,
flash-memory cards, digital versatile disks, Bernoulli cartridges, RAMs, ROMs,
and the like.

Program modules may be stored on the hard disk, magnetic disk 29,
optical disk 31, ROM 24 and RAM 25. Program modules may include operating
system 35, one or more application programs 36, other program modules 37, and
program data 38. A user may enter commands and information into personal
computer 20 through input devices such as a keyboard 40 and a pointing device
42. Other input devices (not shown) may include a microphone, joystick, game
pad, satellite dish, scanner, or the like. These and other input devices are often
connected to the processing unit 21 through a serial-port interface 46 coupled to
system bus 23; but they may be connected through other interfaces not shown in
Figure 1, such as a parallel port, a game port, or a universal serial bus (USB). A
monitor 47 or other display device also connects to system bus 23 via an
interface such as a video adapter 48. In addition to the monitor, personal
computers typically include other peripheral output devices (not shown) such as
speakers and printers.

Personal computer 20 may operate in a networked environment using 
logical connections to one or more remote computers such as remote computer 
49. Remote computer 49 may be another personal computer, a server, a router, a 
network PC, a peer device, or other common network node. It typically includes 
many or all of the components described above in connection with personal 
computer 20; however, only a storage device 50 is illustrated in Figure 1. The 
logical connections depicted in Figure 1 include local-area network (LAN) 51 
and a wide-area network (WAN) 52. Such networking environments are 
commonplace in offices, enterprise-wide computer networks, intranets and the
Internet.

When placed in a LAN networking environment, PC 20 connects to local
network 51 through a network interface or adapter 53. When used in a WAN
networking environment such as the Internet, PC 20 typically includes modem
54 or other means for establishing communications over network 52. Modem 54
may be internal or external to PC 20, and connects to system bus 23 via
serial-port interface 46. In a networked environment, program modules, such as those
comprising Microsoft® Word, which are depicted as residing within PC 20, or
portions thereof, may be stored in remote storage device 50. Of course, the
network connections shown are illustrative, and other means of establishing a
communications link between the computers may be substituted.

Software may be designed using many different methods, including
object oriented programming methods. C++ and Java are two examples of
common object oriented computer programming languages that provide
functionality associated with object-oriented programming. Object oriented
programming methods provide a means to encapsulate data members (variables)
and member functions (methods) that operate on that data into a single entity
called a class. Object oriented programming methods also provide a means to
create new classes based on existing classes.

An object is an instance of a class. The data members of an object are
attributes that are stored inside the computer memory, and the methods are
executable computer code that acts upon this data, along with potentially
providing other services. The notion of an object is exploited in the present
invention in that certain aspects of the invention are implemented as objects in
one embodiment.

An interface is a group of related functions that are organized into a
named unit. Each interface may be uniquely identified by some identifier.
Interfaces have no instantiation, that is, an interface is a definition only without
the executable code needed to implement the methods which are specified by the
interface. An object may support an interface by providing executable code for
the methods specified by the interface. The executable code supplied by the
object must comply with the definitions specified by the interface. The object
may also provide additional methods. Those skilled in the art will recognize that
interfaces are not limited to use in or by an object oriented programming
environment.
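
The relationship between an interface and an object that supports it can be
illustrated with a short Python sketch. This is a generic illustration rather
than code from any embodiment; the names DocumentAnalyzer, WordCounter, and run
are hypothetical.

```python
from typing import Protocol


class DocumentAnalyzer(Protocol):
    """Interface: a named group of method definitions with no executable code."""

    def analyze(self, text: str) -> dict: ...


class WordCounter:
    """A class whose objects support DocumentAnalyzer by supplying the code."""

    def analyze(self, text: str) -> dict:
        return {"words": len(text.split())}

    def describe(self) -> str:
        # An additional method beyond those specified by the interface.
        return "counts whitespace-separated words"


def run(analyzer: DocumentAnalyzer, text: str) -> dict:
    # The caller relies only on the interface, not on the concrete class.
    return analyzer.analyze(text)


result = run(WordCounter(), "documents on the Web")
```
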

Invention Overview 

A block flow diagram of operation of the invention is shown in Figure
2A generally at 200. An information highlighting facility is designated as
information highlighting facility 210 in Figures 2A and 2B. The term
highlighting facility refers to multiple functions used to highlight the relevancy
of one or more documents as described below. It is not meant to be a term that
refers only to the common function of highlighting text. The information
highlighting facility also includes a document analysis facility to analyze
documents prior to applying highlighting functions.

A user's information need is represented at 205 in Figure 2A. The need
is communicated to a means of accessing the web, such as a web browser 208,
and to an information highlighting facility 210. The information highlighting
facility 210 creates a model of the user's information need that is more or less
independent of the expression of the user's information need that is
communicated by the user to a particular information providing service 212
(e.g., search engines on the Web). The information providing service 212 also
comprises an index 213 that identifies documents 214 by means of an address or
URL from which a web browser 217 may retrieve and display documents.
Documents may also be provided directly to the information highlighting facility
210.

Input to the information highlighting facility 210 can be, for example, a
single query or a set of queries 215 communicated by the user to the Web 
information providing service 212 (e.g., queries to a Search engine). These 
queries are in one embodiment captured from the Web page of a search engine at
the time the user types a query into the search box provided by a user interface
216. This is referred to as an implicit characterization of the user's information
need since it was not directly communicated to the information highlighting
facility 210, but rather captured by the information highlighting facility 210 by
monitoring the user's actions. Similarly, the system used by the user can
monitor the user's actions and provide information on the task the user is performing
218 (e.g., working on a report, sending an e-mail message, etc.) as a context for
the information highlighting facility analysis to create the model of the user's
information need.
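
One possible way to capture a query implicitly is sketched below in Python. It
assumes, purely for illustration, that the query can be read from the address
of the search request and that the service carries it in a parameter named
"q"; neither assumption comes from the description above, which captures the
query from the search engine's Web page. The function name and the example
address are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def capture_query(url: str, param: str = "q") -> Optional[str]:
    """Return the query text carried in a search URL, or None if absent.

    The parameter name "q" is an assumption; real services differ.
    """
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else None


captured = capture_query("http://search.example.com/results?q=car+insurance&page=1")
```
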

In another embodiment the information highlighting facility provides a
query box that serves the purpose of specifying the query. The specified query is
then sent (copied and pasted, dragged and dropped) to the search box 216 of a
desired search engine. The user is then not required to retype the query when
changing from one search engine to another.
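
A minimal sketch of this reuse of a single query follows, in Python. The
engine names and address templates are invented for illustration, and building
request addresses is only one of several ways the query might be handed to a
search engine.

```python
from urllib.parse import quote_plus

# Hypothetical engines and address templates, for illustration only.
ENGINE_TEMPLATES = {
    "engine_a": "http://a.example.com/search?q={query}",
    "engine_b": "http://b.example.com/find?text={query}",
}


def build_search_urls(query: str) -> dict[str, str]:
    """Format one query, typed once, for every configured engine."""
    encoded = quote_plus(query)
    return {name: template.format(query=encoded)
            for name, template in ENGINE_TEMPLATES.items()}


urls = build_search_urls("car insurance")
```
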

Another, more explicit way of providing information highlighting facility
210 with the characterization of the user's need is by using a user's specification
of the task and intentions at 218 (for example, in a form of a dialogue with
information highlighting facility 210) and/or the user's detailed description of
the information need at 220 (a direct input to information highlighting facility
210). Note that parts or all of the full description of the user's need are then useable
for communicating with a particular information providing service (e.g., a search
engine or information directory on the Web).

Information highlighting facility 210 is provided with a GUI 222
(graphical user interface) that enables direct input from the user. In particular,
the user may specify a desired type of information highlighting facility 210
analysis that should be applied to the viewed documents, with details on the
parameters to be used in the analysis (when required) and preferences on the
display of results as indicated at 223. Furthermore, the user may provide
information on a particular task the user is currently performing as represented at
224 to ensure that the analyses are context sensitive when applicable.

Information highlighting facility 210 contains a module 225 for
managing past requests for information analysis (e.g., storing, retrieving,
concatenating queries and information need descriptions) and/or documents that
have been downloaded and analyzed.

Information highlighting facility 210 analyses typically involve three
components: format recognition and analysis module 227, content analyses 228
(e.g., linguistic and statistical analysis of the text), and resources 229 required
for the analyses (e.g., linguistic and knowledge resources for identifying
company names in the text).

The user specifies the information need 205 to information highlighting
facility 210 directly or indirectly by communicating it to the Web information
providing service 212. The system or the user may also provide information on a
task that the user is currently performing. The user also specifies the type of
information highlighting facility analysis that should be performed on the
accessed documents.

This request for information is communicated via Web browser 217 to
the information providing service. As a result, the user is provided with URLs
and perhaps some additional information about documents that potentially
satisfy the user's information need. For example, in the case of Web search engines,
the result of a search is typically a ranked list of document titles with short
summaries and URLs.

Based on the task context 224 and the specification of the user's
information need, information highlighting facility 210 creates a model of the
user's information need represented at 232.

Figure 2B provides further information about the process flow of the
invention. The numbering of modules is consistent with Figure 2A. Information
highlighting facility 210 provides several features to enhance or highlight
documents as indicated at 240. Such features may include terminology
highlighting, document scrolling, entity extraction and relation finding,
hyperlink analysis, document relevance ranking, document thumbnails, and
document summarization.

As an example of the process flow, if the user desires to have relevant
terminology from the information request highlighted in the accessed
documents, information highlighting facility 210 processes the request for
information using linguistic analysis tools 228 and knowledge resources 229 to
create a rich model 232 of the topic of interest. For example, it may perform
synonym expansion of the original terms in the information request to ensure
that relevant information is highlighted in the document without the need for the
user to try to anticipate the linguistic variations in which the topic is described in
the text.
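
Synonym expansion of this kind might be sketched as follows in Python. The
small THESAURUS table stands in for knowledge resources 229 and its entries are
invented; the function name expand_terms is likewise an assumption, and a
practical facility would apply fuller linguistic analysis than a word lookup.

```python
# Hypothetical thesaurus standing in for knowledge resources 229.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "buy": ["purchase"],
}


def expand_terms(request: str,
                 thesaurus: dict[str, list[str]] = THESAURUS) -> dict[str, list[str]]:
    """Map each original term of the request to its synonyms (possibly none)."""
    return {term: list(thesaurus.get(term, [])) for term in request.lower().split()}


rich_model = expand_terms("Buy car")
```

Keeping the original terms as keys and the synonyms as values preserves the
distinction between the two classes of terminology, which later analyses may
treat differently.
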

As the user accesses a document, the model of the user's information
need is used in the analysis of the document. For example, terminology
highlighting is achieved by detecting in the document text (e.g., pattern
matching) the terminology from the rich linguistic representation of the user's
information need created by information highlighting facility 210. The user can
specify various parameters related to terminology highlighting at 223. For
example, the user may prefer to have terminology from the original description
of the information need highlighted in one color while all the synonyms in some
other color. Or, perhaps, the user may want only the occurrence of multi-word
phrases from the request highlighted in the document, etc.
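
The pattern matching and two-color marking described above can be illustrated
with the following Python sketch. It assumes the document is plain text and
that the marked result is emitted as HTML span elements; the function name, the
default colors, and the whole-word matching rule are choices made for this
example only.

```python
import html
import re


def highlight(text: str, original: list[str], synonyms: list[str],
              original_color: str = "yellow",
              synonym_color: str = "lightgreen") -> str:
    """Wrap whole-word matches in colored <span> tags, one color per term class."""
    color_of = {term.lower(): synonym_color for term in synonyms}
    color_of.update({term.lower(): original_color for term in original})
    if not color_of:
        return html.escape(text)
    # Longest terms first so multi-word phrases beat their component words.
    alternatives = "|".join(
        re.escape(term) for term in sorted(color_of, key=len, reverse=True))
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)

    pieces, last = [], 0
    for match in pattern.finditer(text):
        pieces.append(html.escape(text[last:match.start()]))
        color = color_of[match.group(0).lower()]
        pieces.append(f'<span style="background:{color}">'
                      f'{html.escape(match.group(0))}</span>')
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)


marked = highlight("A car is an automobile.", ["car"], ["automobile"])
```

Restricting the marking to multi-word phrases, as mentioned above, would amount
to passing only those phrases as the original terms.
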

Some types of information highlighting facility analysis may require
pre-fetching the document text in the background as the user is performing other
tasks, e.g., viewing the result list from the search engine. For example, suppose
that the user requested that thumbnail images of documents that were indicated
by the search engine be displayed with query terminology highlighted in them.
In that case, the text of documents from the search result page being viewed by
the user could be downloaded in the background as represented by
communication line 245, analyzed for query terminology and document layout
and the highlighted thumbnail images would be displayed.
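
Background pre-fetching could be sketched in Python as below. The download
function is passed in by the caller so that the example runs without network
access; fake_fetch and FAKE_WEB are stand-ins invented for the example, and the
use of a thread pool is an assumption rather than the disclosed mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def prefetch(urls: list[str], fetch: Callable[[str], str],
             max_workers: int = 4) -> dict[str, str]:
    """Download the listed documents concurrently and return url -> text.

    Documents whose download fails are skipped rather than aborting the batch.
    """
    texts: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                texts[url] = future.result()
            except Exception:
                continue  # unreachable document: leave it out
    return texts


# Stand-in for a network fetch so the sketch runs offline.
FAKE_WEB = {"http://example.com/a": "car insurance rates"}


def fake_fetch(url: str) -> str:
    return FAKE_WEB[url]  # raises KeyError for unknown addresses


fetched = prefetch(["http://example.com/a", "http://example.com/missing"], fake_fetch)
```

As written, prefetch returns only when all downloads have finished; to keep the
user interface responsive it would itself be started on a separate thread.
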

Similarly, suppose that the user requested an alternative ranking of the
search result based on the rich information highlighting facility representation of
the user's need (as opposed to the short query that the user may have
communicated to the search engine). The document text of some selected
documents (e.g., top N ranked documents) could be pre-fetched in the
background, linguistically and statistically processed, and compared with the
information highlighting facility 210 model of the user's interest. The documents
would be scored and an alternative ranking of them presented to the user.
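
The scoring and re-ranking step might look like the following Python sketch.
The scoring rule, a weighted count of model terms, is a deliberately simple
stand-in for the linguistic and statistical processing mentioned above, and the
weights, addresses, and function names are assumptions.

```python
import re


def score(text: str, weighted_terms: dict[str, float]) -> float:
    """Sum the weights of model terms, counting every occurrence in the text."""
    words = re.findall(r"\w+", text.lower())
    return sum(weighted_terms.get(word, 0.0) for word in words)


def rerank(results: list[tuple[str, str]], weighted_terms: dict[str, float],
           top_n: int = 10) -> list[tuple[str, str]]:
    """Re-order the top N (url, text) results by descending model score."""
    head, tail = results[:top_n], results[top_n:]
    # sorted() is stable, so ties keep the search engine's original order.
    head = sorted(head, key=lambda item: score(item[1], weighted_terms), reverse=True)
    return head + tail


# Original query terms weighted above the synonym added by expansion.
model_terms = {"car": 1.0, "insurance": 1.0, "automobile": 0.5}
results = [
    ("http://example.com/1", "Home insurance basics"),
    ("http://example.com/2", "Car insurance for your automobile"),
]
ranked = rerank(results, model_terms)
```
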

Many of the information highlighting facility 210 analyses could be
equally applied to the documents accessed as the user is browsing through the
documents.

Information highlighting facility 210 may be implemented as a local
service on the user's desktop or as a remote service, or can be a combination of
the two. For example, some information highlighting facility 210 analyses could
employ the locally available resources (e.g., thesauri or knowledge base that the
user may have available locally).

When applied as a remote service (and thus used by a number of users),
information highlighting facility 210 could benefit from the information it may
store on the user community. For example, it may store some types of analysis of
documents that have been performed as a result of the users' requests within a
certain period of time (e.g., an hour, or a day, etc.).
For example, if a user A requested that the accessed documents be
analyzed for company names and person names, information highlighting
facility 210 can perform this analysis and store the analysis results. When a user
B accesses the same document and asks for the same analysis, the results could
be delivered without repeating the document analysis (thus saving
processing time).
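
A shared store of analysis results with a retention period can be sketched in
Python as follows. The class name AnalysisCache, the choice of the document
address and analysis type as the key, and the one-hour default are assumptions
made for the example.

```python
import time
from typing import Callable


class AnalysisCache:
    """Share analysis results across users for a limited period of time."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: dict[tuple[str, str], tuple[float, object]] = {}

    def get_or_compute(self, url: str, analysis: str,
                       compute: Callable[[], object]) -> object:
        key = (url, analysis)
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]          # fresh stored result: skip the analysis
        result = compute()         # missing or expired: analyze again
        self._store[key] = (now, result)
        return result


calls = []


def find_company_names() -> list[str]:
    calls.append(1)                # count how often the analysis really runs
    return ["Example Corp"]


cache = AnalysisCache(ttl_seconds=3600)
for_user_a = cache.get_or_compute("http://example.com/doc", "company_names",
                                  find_company_names)
for_user_b = cache.get_or_compute("http://example.com/doc", "company_names",
                                  find_company_names)
```

Because the authors of Web documents can change them at any time, the
retention period bounds how stale a stored result can become.
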

As indicated above, information highlighting facility 210 captures
information about the user's need. This can be done, in one embodiment, based
on the queries that the user issues to the Web search engines or different Web
services at the service Web site. It can also be based on the user's description of
the user's interest or information need communicated directly to information
highlighting facility 210 through the information highlighting facility interface
222. Furthermore, the information highlighting facility 210 may make inferences
or collect from the user explicitly (e.g., through a dialog) information about the
user's task or intentions or preferences about the characteristics of documents
(e.g., format of the documents that the user wants to access or avoid) or similar.

Based on the collected information, the information highlighting facility
210 builds the representation or model of the user's interest. This model then
provides a context for analysis and information highlighting of any document
accessed by the user. In one embodiment these are the documents downloaded
from the Web. However, information highlighting facility 210 can be extended
with components that recognize formats of documents from various sources
(e.g., documents created by applications running locally on the user's desktop,
documents delivered via e-mail, etc.). All information highlighting facility 210
features could then be applied to the content of those documents and the results
displayed appropriately.

Users may access documents by directly executing a URL of the desired
document via the browser 217 or may follow a hyperlink in the currently viewed
document or may select to access documents from a list of URLs presented to
the user by a Web service (Search or others) as a result of the user's request for
information.

As the documents are downloaded by the browser 217 they are processed
by the information highlighting facility 210 in view of the model of the user's
interest. The results of the information highlighting facility 210 processing are
then displayed appropriately to the user. Information highlighting facility 210
may include a number of different features and supporting analyses comprising
but not limited to: marking of terminology in the text, scrolling to the relevant
passages in the document, extracting specified entity names and relations among
entities in the text, summarizing documents by selecting sentences salient to the
content of the document, or related to the query, etc., ranking documents in a
designated document set with respect to the information highlighting facility 210
representation of the user's need, analyzing hyperlinks in the viewed documents
with respect to the user's need, and creating various visual representations of the
documents, such as thumbnail document images with highlighted information in
the document text and hyperlinks to support reading of and browsing through the
document text.

The information highlighting facility 210 provides support for storing
and managing various models of the user's interests. In particular it enables the
user to select which of the existing models or combination of the existing models
should be used as the context for the analysis of documents.

If the user wishes to engage Search or similar Web services for
information seeking, the user's queries or parts of the comprehensive information
highlighting facility 210 model of the user's interest 232 are sent via browser
217, such as Internet Explorer, for processing by the service 212. The user
interface 216 running on the service end receives queries and performs the
search operation over the documents that have been collected and processed by
the service. Typically the services store information about the documents,
including the document URL (uniform resource locator), in the form of index
213. As a result of the query processing, document identifiers, such as URLs,
are retrieved from the index and typically ranked in relevance to the queries.
The URLs are sent back to the client.

In one embodiment, the user's interest model is generated by analyzing
the query terms as entered by the user in 216. This may involve creating an
augmented set of search terms based on syntactic analysis and semantic
expansion of the user's query. The information highlighting facility 210 then
provides highlighting of the original and expanded query terminology in the
documents accessed upon the user request (via document identifier, the URL).
Furthermore, the information highlighting facility 210 may use information
about the wider context, e.g., the user task or user's explicit preferences, to
perform the terminology highlighting appropriately. For example, to support
more efficient reading of the document, information highlighting facility 210
may perform selective terminology highlighting in the text by highlighting only
key concepts from the user's interest model in the paragraphs that are assessed as
most relevant to the user's need.

In one embodiment the information highlighting facility 210 receives the
list of URLs from the Search engine or other Web service and begins to
download documents 214 identified via browser 217 in the background (while
the user is performing other tasks, like reading the result list, etc.) in order to
perform the linguistic and statistical analysis of the document texts. MS Read
then re-ranks the documents with respect to their relevance to the user's interest
model, a more comprehensive representation of the user's interest than the one
presented by the user to the Search or some other Web service 212.

In one embodiment, information highlighting facility 210 performs
document analysis without a need for downloading and analyzing the document
text in advance or in the background. This is done based on simple text analysis
that adds no significant processing time beyond that required to
download and display the document. In still a further embodiment, other
document analysis can be performed in the background as represented by line
245. This analysis may be more involved and require each document to be
downloaded. Both approaches can be used to support entity extraction and
relation finding, document summarization, etc.

In case that the user engages in browsing through Web documents, the
user can either specify an existing context, i.e., a model of the user's interest or
need that information highlighting facility 210 created previously, or can initiate
the creation of a new one by providing information to the information
highlighting facility 210 in various forms, including but not limited to a
description of a particular topic interest, preferences, intentions and purpose of
the browsing task, etc. Information highlighting facility 210 then creates the
appropriate user's interest model as described above and applies it to the
documents as the user browses the Web. In one embodiment, the information
highlighting facility 210 downloads in the background the documents that are
pointed to by the hyperlinks in the currently viewed document. These documents
are analyzed with respect to the current model of the user's interest. The result of
the analysis is information to the user about the relevance of the hyperlinks and
suggestions for further steps in browsing. In other embodiments the hyperlink
analysis is performed by the information highlighting facility 210 based on the
text in the current document that surrounds the hyperlinks, thus without the need
to download the linked documents in the background.

Analyses performed by the information highlighting facility 210 can be
performed locally, using the local information resources as needed (linguistic
resources such as lexicons, dictionaries, knowledge base, etc.) or remotely or as
a combination of the two. The types of analyses include but are not limited to:

Terminology marking. When a document is downloaded, the terminology
describing the user model can be highlighted, for example, by making keywords
and key phrases bolder than the surrounding text, or by changing the background
color to facilitate easier spotting in the text. In one embodiment this type of
terminology marking can be done at the time the document is downloaded. In
another embodiment, a more sophisticated terminology marking is provided by
pre-fetching and analyzing the document text in the background (e.g., while the
user is performing other tasks, such as reading the document titles in the result
sets of the search engines).

Scrolling. When a document is downloaded, it can be scrolled, for
example, to the most relevant portion of a multi-page document. This can be
done, for example, by statistical and linguistic analysis of the text that involves
scoring individual paragraphs or subparts of the document with respect to the
user model. Alternatively, it may be based on a simple statistical analysis of the
occurrences of terminology from the user's interest model in the text at the time
the document is being downloaded, thus with no need for pre-fetching the
document text.

Re-ranking. The list of documents provided by one or more search
engines may be re-ranked based on relevance ranking and based on a
representation of the user's need. The re-ranking may be based on, but is not
restricted to, the analysis of information from the summaries provided by the
search engines or by pre-fetching the document text and performing additional
relevance assessment. This analysis may range from simple pattern matching of
the document text and the terminology in the user model to deeper linguistic and
statistical analyses and relevance scoring of the document texts.

Document Thumbnailing. Based on a downloaded document, a thumbnail
image of the document may be created with or without highlighting of various
information found in the document text (e.g., the user query term, the expanded
model of the user need, most salient sentences in the text, etc.). Links from the
thumbnail image to the document text could be provided to enable easy
browsing through the document. By providing visual cues, the thumbnail image
of a document provides assistance in assessing the relevance of the whole or
parts of the document.

Summarization. A summary of the document text can be provided by, but
is not restricted to, extracting salient sentences from the text as identified, for
example, by pattern matching with the terminology of the user's interest model
or by a deeper linguistic and statistical analysis of the document text. In one
embodiment, the summaries are generated based on various entities and entity
relations found in the text, related to or independent from the current user's
interest model.

Link analysis. The internal and external links on a web page can be 
assessed by, for example, downloading the text of the linked documents in the 
background and assessing their utility with respect to the user model. Such
information may be communicated to the user as an aid in deciding whether or 
not to follow the links. 
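
As a minimal sketch of the simplest analysis listed above, terminology marking at download time can be approximated by pattern matching alone. The function name mark_terms and the use of bold tags are assumptions for illustration. The sketch treats the document as plain text and does not attempt to skip existing markup.

```python
import re


def mark_terms(text, terms, open_tag="<b>", close_tag="</b>"):
    """Wrap whole-word, case-insensitive occurrences of the terms in marker tags."""
    # Longest terms first, so that "search engine" is preferred over "search".
    ordered = sorted({t for t in terms if t}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Keep the matched text as written in the document; only add the markers.
    return pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)
```

Changing the background color instead of the weight of the text only requires different values for open_tag and close_tag.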

In Figure 3, a terminology highlighting or marking facility, which is one
of the features of the information highlighting facility 210, is indicated generally
at 310. The terminology highlighting facility consists of a client component 315
(i.e., highlighter) that can be an independent application or part of a browser.
The highlighter operates in one of two modes: query mode 320 and profile mode
325. The highlighting facility also consists of an analyzer 330.

In the query mode, when a query is issued, the highlighter captures the
query at 335 (such as from the search window on the search engine's web page)
as entered by the user and sends it to the analyzer 330 for syntactic analysis and
semantic expansion.

Note that instead of capturing the query from the search engine page the
highlighting application can provide a separate window or a search box for
typing in the query. That query could then be sent to any search engine. The
advantage of this approach is that the user need not retype the query if the user
wants to use services of different search engines or other Web services in
general.

The query analyzer 330 is a (local or remote) service that takes the query
term or any other short description on a topic as input, and returns an augmented
set of terms to the client as a result. The query term analysis is completely
independent of the actual search and can be processed in parallel while the
search engine is processing the query. In one embodiment, the analyzer is
implemented as a remote service that accepts terms for analysis via a network
connection.
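
The input and output of such an analyzer can be illustrated with a deliberately simplified sketch. Here a lookup table of related terms stands in for the syntactic analysis and semantic expansion, which are not detailed in this description. The function name expand_query and the thesaurus argument are hypothetical.

```python
def expand_query(query, thesaurus):
    """Return (original_terms, augmented_terms) for a short query string.

    thesaurus is an assumed mapping from a term to a list of related terms.
    """
    original = query.lower().split()
    augmented = []
    for term in original:
        for related in thesaurus.get(term, ()):
            # Do not repeat original terms or terms already added.
            if related not in original and related not in augmented:
                augmented.append(related)
    return original, augmented
```

Because the function needs only the query text, it can run while the search engine is still processing the same query.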

The original query terms and the augmented set of terms together
represent the query context as indicated at 355. The system also makes an
association between the result page and the query context in order to ensure the
original query is used for default highlighting until the user explicitly changes
the context. When the user browses the Web within this query context (by
choosing one of the links that represents a document found by the search
engine), the corresponding terms are highlighted in the accessed document at
360.

Note that there can be any number of active contexts stored in the
terminology highlighter. The association between the result page and the original
query may be used to enforce the default highlighting of all the documents on
the result list. For instance, if a user returns to the result page of a previous
query, the terms of that query context will be highlighted if a document is
browsed to from the result page. Additionally, terms of one context can be
applied to and highlighted within documents of a different query context, and
new contexts can be constructed by combining terms of other contexts (for
example the terms of several related queries can be combined or merged to build
a new context).
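
One possible way to hold several active contexts, associate each with a result page, and merge contexts is sketched below. The names QueryContext, merge_contexts and ContextStore are hypothetical, and the sketch assumes that a context is fully described by its original and augmented terms.

```python
from dataclasses import dataclass, field


@dataclass
class QueryContext:
    original_terms: list
    augmented_terms: list = field(default_factory=list)

    def terms(self):
        # Original terms first, then augmented terms, without duplicates.
        return list(dict.fromkeys(self.original_terms + self.augmented_terms))


def merge_contexts(*contexts):
    """Build a new context from the terms of several related contexts."""
    original, augmented = [], []
    for context in contexts:
        original.extend(context.original_terms)
        augmented.extend(context.augmented_terms)
    return QueryContext(list(dict.fromkeys(original)), list(dict.fromkeys(augmented)))


class ContextStore:
    """Associates result pages with the query context that produced them."""

    def __init__(self):
        self._by_result_page = {}

    def associate(self, result_page_url, context):
        self._by_result_page[result_page_url] = context

    def default_for(self, result_page_url):
        # Context used for default highlighting of documents reached from this page.
        return self._by_result_page.get(result_page_url)
```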

In the profile mode 325, the user can provide (e.g., by means of a dialog
box) a description of the topic of interest at 365 which is then analyzed at 330
analogously to the user's query to provide an augmented set of profile terms.
This set of profile terms may be created in parallel with other activities that the
user may perform and is then used as a basis for highlighting 360 of all
subsequent documents that the user accesses either in real time, or as a
background task. The model of the user's interest may also be used as a basis for
highlighting 360.

In Figure 4 a block diagram shows components involved in providing
augmented search terms and highlighting generally at 410. A user query (in the
search mode) or the description of the user's interest (e.g., in the browsing
mode) is represented at 415 and is generated by a user for sending to a search
engine or providing it to the read system as an interest profile. The query may
be created on a search engine page, or may also be created on the client side in a
separate window or search box, and then sent to the search engine. User context
information is gathered at 420, and comprises an analysis of the tasks that a user
is performing, and analysis of other searches or interest profiles that appear to be
related. An analysis engine receives the query and context information, and (in
one embodiment) uses natural language processing at 430 and semantic
expansion at 435 to provide a model of the user's interest, which in one
embodiment may be a set of augmented search terms 440 or a user interest
profile. Highlighting of text is then performed at 445 based on the model 440, in
one embodiment by selecting a bright background color for all terms found in
the document. When used to mark or highlight portions of the document, the
model provides the ability to better identify text which is more relevant to the
actual intent of the user. Several different types of additional highlighting are
described with reference to further figures below. In one embodiment the
document text is accessed and analyzed statistically and linguistically. This
analysis enables more sophisticated highlighting methods. For example,
highlighting of terms that play a role of a subject or object in the query or profile
description is more effective for reading a document than highlighting in the
document all the concepts that appear in the query or the profile description.
Similarly, query and interest profile terms could be highlighted in the document
text only if they appear to have a specific linguistic role, e.g., the role of a
subject or object.

In Figure 5, a flow diagram indicated generally at 510 shows scrolling of
a document to its most relevant portion based on the analysis of the document
text. A next document identified in search results or accessed by browsing is
received at 515. Subparts of the document are identified at 520. The subparts
may be passages, sentences, lines, or paragraphs, all of a desired length or the
length determined based on the distribution of query terms in the text. The
subparts may in fact overlap if desired. Each of the subparts is then scored at
525 using one of several well known relevance matching functions with respect to
the model of the user's interest. Statistics from any reference corpus can be used
for that purpose. The scoring may also be similar to that used by the search
engine, but may also include the use of the model to give a better indication of
relevancy. Further, a best portion of the document may be identified by
combining consecutive paragraph scores or applying another method, such as (in
one embodiment) a Hidden Markov Model (well known in the art) to identify the
best passage at 530. At 535, the document is scrolled to the most relevant
passage as identified above. The most relevant passage may be scrolled to in the
actual document, or may be part of a list of passages which are provided with a
link at 540 to corresponding documents. This provides a document list showing
the most relevant passage of each document to enable the user to determine
which document may be most relevant. If the latter, decision block 545
determines whether the document received was the last document in the search
results, or selected portion of search results for this function. If not, the next
document is received at 515, and its most relevant portion identified. If it was
the last document, control is returned at 550.
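
The steps at 520 through 530 can be sketched under simplifying assumptions: subparts are paragraphs separated by blank lines, the relevance function is the fraction of words that are model terms, and the best passage is found by summing scores over a sliding window of consecutive paragraphs. A Hidden Markov Model is not used in this sketch. All function names, and the restriction to single-word terms, are assumptions.

```python
import re


def split_paragraphs(text):
    """Identify subparts (here: paragraphs separated by blank lines)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def score_paragraph(paragraph, terms):
    """Fraction of the paragraph's words that are (single-word) model terms."""
    words = re.findall(r"\w+", paragraph.lower())
    if not words:
        return 0.0
    wanted = {t.lower() for t in terms}
    return sum(1 for w in words if w in wanted) / len(words)


def best_passage(paragraphs, terms, window=2):
    """Index of the first paragraph of the best-scoring run of `window` paragraphs."""
    scores = [score_paragraph(p, terms) for p in paragraphs]
    if not scores:
        return None
    window = max(1, min(window, len(scores)))
    best_start, best_total = 0, float("-inf")
    for start in range(len(scores) - window + 1):
        total = sum(scores[start:start + window])   # combine consecutive scores
        if total > best_total:
            best_start, best_total = start, total
    return best_start
```

The returned index identifies the paragraph to which the viewer would scroll. How the scroll itself is performed depends on the browser and is outside this sketch.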

In one embodiment the scrolling of the document is based purely on the 
pattern matching of the document text with the query or model of the user's 
interest. For example, the document is automatically scrolled to the first 
occurrence in the text of an important concept in the query or model. Further, the 
document can be scrolled to the paragraph with the highest density of the query 
or correlation with the model of the user's interest. These document scrolling 
methods do not require accessing and analyzing document text in advance. 

In Figure 6, a flow diagram indicated generally at 610 shows re-ranking
of a list of documents provided by a search engine or the documents that are
linked to the currently viewed document via hyperlinks. In the search mode, the
list of documents is received at 615, and the top N documents referred to as best
hits by the search engine are accessed from the respective servers at 620, as a
background task while the user may be looking at the list, or performing other
tasks. N may range from 2 to as many as resource constraints permit. N is 30 in
one embodiment. The entire document, or some number (K) of pages of the
document may be used. Each document may then be scored at 625 in its entirety
or similarly to the portion scoring as described previously using a relevance
matching method. The scoring may be based on the model, including at least
augmented search terms and linguistic analysis of the document text. The list of
documents is then sorted in accordance with the document scores at 630. An
alternative rank of each of the documents can be provided, or a new list of fewer
than N provided. The list is then provided to the user at 635, and control is
returned at 640.
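
A minimal sketch of the search-mode flow at 615 through 635 follows. The fetch_text callback, which stands in for background retrieval of each document, and the term-frequency relevance function are assumptions. Any relevance matching method could be substituted.

```python
import re


def relevance(text, model_terms):
    """Fraction of the document's words that are (single-word) model terms."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    wanted = {t.lower() for t in model_terms}
    return sum(1 for w in words if w in wanted) / len(words)


def rerank(urls, model_terms, fetch_text, top_n=30):
    """Re-rank the top N results by document score; the rest keep their order."""
    head, tail = urls[:top_n], urls[top_n:]
    scored = []
    for position, url in enumerate(head):
        score = relevance(fetch_text(url), model_terms)
        # Higher score first; ties keep the search engine's original order.
        scored.append((-score, position, url))
    scored.sort()
    return [url for _, _, url in scored] + tail
```

The default of 30 for top_n follows the value of N given for one embodiment above.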

In the browsing mode, the list of documents received at 615 represents all
the documents linked to the currently viewed document. The documents are
accessed from the respective servers at 620 in the background and scored at 625
for relevance with respect to the model of the user's interest that the current
document may be associated with. The resulting score for each linked document
is then displayed in relation to the document link on the current page and serves
as a guide for following the links if desired.

In Figure 7, a flow diagram indicated generally at 710 shows
identification and provision of a list of entities (such as names associated with a
document) and relations among entities in a document. A document is received
at 715, and documents are downloaded at 720. Heuristics for identifying entity
names and relations among entities (e.g., for person names that may include
recognizing titles, capitalization, position and function in the sentence, etc.)
combined with lexicon lookups, are then applied to identify entity names and
relations in the document at 725. A list of entity names and relations is created
at 730. At 735, links into the document corresponding to the entity names and
relations are provided. In one embodiment, the list of extracted entities is
displayed in a separate window, and each entity is supplied with navigational
features, such as an up and down arrow to navigate to next and previous
occurrences of the entity in the document. Information about the particular
entity or entity relation may be extracted from additional resources at 740. For
example, if the entity is a company name, appropriate information services
providing information about such entities can be used to supply a link to the web
site of the particular company. If the entity is a person name, the user may be
able to access a person's web site using appropriate information services, or if
the person is a publicly known figure, the latest information available from the
press. Similarly, if two entities, for example a person with the name N and a
company with the name C, are connected through the relationship "N is the
President of C", the system can provide the link to the pages where the person N
is mentioned as the President of C. This feature may apply to a variety of
entities, such as geographical features, countries, trademarks, etc. and typical or
important relations among such entities. The list of entity names and relations
with links is provided to the user at 745, and if the last document has been
processed at 750, control is returned at 755. This process may be applied to a
selected number of documents, or may continue in the background as long as is
desired, or until the context is switched.
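
The kind of heuristic applied at 725 can be illustrated with a small sketch that recognizes person names by a preceding title, company names by a trailing suffix, and other entities by lexicon lookup. The title and suffix lists, the function name extract_entities, and the lexicon format are assumptions. Relation finding and sentence-level analysis are omitted.

```python
import re

TITLES = ("Mr.", "Ms.", "Mrs.", "Dr.", "Prof.")            # assumed title list
COMPANY_SUFFIXES = ("Inc.", "Corp.", "Corporation", "Ltd.")  # assumed suffix list
_CAP = r"[A-Z][a-z]+"                                        # one capitalized word
_NAME = rf"{_CAP}(?:\s+{_CAP})*"                             # run of capitalized words


def extract_entities(text, lexicon=None):
    """Return sorted (offset, kind, name) tuples; offsets allow linking into the text."""
    found = set()
    titles = "|".join(re.escape(t) for t in TITLES)
    for m in re.finditer(rf"(?:{titles})\s+({_NAME})", text):
        found.add((m.start(1), "person", m.group(1)))
    suffixes = "|".join(re.escape(s) for s in COMPANY_SUFFIXES)
    for m in re.finditer(rf"{_NAME}\s+(?:{suffixes})", text):
        found.add((m.start(), "company", m.group(0)))
    # Lexicon lookup: an assumed mapping from a known name to its entity kind.
    for name, kind in (lexicon or {}).items():
        for m in re.finditer(re.escape(name), text):
            found.add((m.start(), kind, name))
    return sorted(found)
```

Because every entry carries its character offset, successive entries for the same name give the next and previous occurrences needed for the navigational arrows.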

In Figure 8, a flow diagram indicated generally at 810 shows creation of
a thumbnail of a document with highlighting. A next document is received
through browsing or downloaded at 815 from the list of documents provided by
a search engine. If the accessed document can be viewed as a single screen
document (of some default size, for example) a thumbnail of the whole
document is created. On the Web the concept of a page is different from
traditional paper documents. The size of a page can be a fixed size specified by
the user or the system, or can be based on the size of the window used to view
the document. For multi-page documents the most relevant passages can be
found at 820, and a thumbnail of the page containing the best passage is created at
825.

The thumbnail appears as a single sheet of paper and may either relate to
the first page of a document, or some scaled version or abstract representation of
the document. Larger documents may even be displayed as a stack of
thumbnails with navigation therebetween. As an alternative, the thumbnail of
multi-page documents can be created at 825 without identifying the most
relevant passages as represented by broken line 828. Instead, the thumbnail may
be an abstract representation of the whole document in the form of a fixed length
page partitioned into blocks that correspond to pages. They can be colored to
reflect the presence of important terminology in the particular part of the
document. For example, the color of the particular block can be related to the
color used to highlight the most prominent term in that part of the document.
The result of this approach is a thumbnail filled with the spectrum of colored
blocks that visualize the relevance of each part of the document.
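
The block coloring of the abstract thumbnail might be computed as in the following sketch, which assigns to each page the highlight color of its most frequent model term. The function name block_colors, the lowercase single-word keys of term_colors, and the default color for pages without any model term are assumptions.

```python
import re
from collections import Counter


def block_colors(pages, term_colors, default="white"):
    """Return one color per page: the highlight color of its most prominent term.

    term_colors is an assumed mapping from a lowercase term to its highlight color.
    """
    colors = []
    for page in pages:
        counts = Counter(
            w for w in re.findall(r"\w+", page.lower()) if w in term_colors
        )
        if counts:
            # Ties go to the term that appears first on the page.
            top_term = max(counts, key=counts.get)
            colors.append(term_colors[top_term])
        else:
            colors.append(default)
    return colors
```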

At 830, portions of the thumbnail corresponding to the most relevant
passages are highlighted. Portions may also be highlighted without assessing the
relevance of the passages. Links are then provided at 835 from the highlighted
portions to the corresponding passages or portions of the document. The
thumbnail is then displayed to the user at 840, and the process is repeated based
on decision block 845 for a selected number of documents. Control is returned
at 850.

In one embodiment the thumbnail highlighting is based on the pattern
matching of the query terms or interest profile terms without deeper linguistic
analysis of the document text and identification of relevant passages. Generally,
thumbnail highlighting can be done with respect to any information about the
user's interest or information extracted from the document.

In Figure 9, a flow diagram indicated generally at 910 shows creation of
a summary of a document. A next document is received at 915, and the most
relevant passages with respect to the model, which may include the query (in the
search mode) or interest profile (in the browsing mode), or independent from the
current user's context, are identified at 920 as previously described. Selected
passages are then extracted and assembled to form a summary at 925. In this
embodiment, the summaries are created by extracting sentences from the text
that contain prominent query terminology. The summary may also be limited to
a predetermined length, with the most relevant passages or sentences being used
first.
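
A sketch of such an extractive summary, under stated assumptions, follows. Sentences are split on terminal punctuation, scored by the number of query-term occurrences, and the highest-scoring ones are kept up to a fixed count and shown in document order. The function name summarize and the use of a sentence count as the length limit are assumptions.

```python
import re


def summarize(text, terms, max_sentences=3):
    """Return up to max_sentences sentences containing query terms, in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    wanted = {t.lower() for t in terms}
    scored = []
    for index, sentence in enumerate(sentences):
        hits = sum(1 for w in re.findall(r"\w+", sentence.lower()) if w in wanted)
        if hits:
            scored.append((-hits, index))   # most hits first, earlier sentence on ties
    chosen = sorted(index for _, index in sorted(scored)[:max_sentences])
    return [sentences[i] for i in chosen]
```

Keeping the index of each selected sentence also provides the position needed to link the summary back into the document.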

Portions of the summary are highlighted at 930, and links are created 
therefrom to corresponding portions of the document at 935. The summary is 
then displayed to the user at 940, and further documents are processed in the 
same manner based on decision block 945. Control is returned at 950.

Conclusion

A highlighting facility on a computer provides information to a user to 
independently assist the user in evaluating the relevance of documents identified 
by a search engine or some other information providing service in response to a 
user query or the relevance of documents accessed in a browsing mode in 
relation to a particular user's interest. When accessing documents identified as 
relevant by the information providing service or in the browsing mode from other
networked computers, the facility determines why a document may be of 
interest, and provides information or highlighting to assist the user in 
determining whether the document is desired. 

An important characteristic of the Web is a separation of data gathering
and indexing from information delivery and presentation. The information
highlighting facility deals with the presentation and information highlighting of
documents to facilitate reading, comprehension, and assimilation of information
found in the accessed documents. Information highlighting is independent of the
search, and thus searches from multiple different search engines can be
relevance assessed and ranked together in a consistent manner. By providing the
highlighting based on actual retrieved documents, up to date versions of the
documents are assured. The facility may base relevancy of a retrieved document
on the original query, or a model of the user's interest, which may include an 
augmented set of search terms or enhanced version of the query which takes into 
account the general interest of the user as captured by an interest profile and 
context of use of the computer by the user, or a combination thereof. This 
provides a consistent and enhanced ability to correctly identify relevance of each 
document, rather than rely on the search engine basing relevance purely on a 
single query. 

Linguistic analysis and semantic expansion to provide the augmented 
version or set of terms is done in parallel with the execution of the query by one 
or more search engines to provide relevance more quickly. The model of the 
user's interest is then applied by the facility to documents as they are accessed 
through a browser to provide highlighting of relevant portions of the document. 
The model can be thought of as an interest profile context, or representation of
the user's information need. When browsing the web within this context or
session, the corresponding terms are highlighted in the accessed documents.

The facility may also be run as a remote service on a powerful computer
(in contrast to the possibly less powerful local computers used by the user) to
further speed up processing and minimize delays. The remote service computer
may in fact have a much higher bandwidth connection to the network, and be
able to process many documents while the user is still considering the list of
documents returned by the search engine or some other information providing
service.

Documents may be scrolled to the most relevant portion of a multi-page
document based on pattern matching of the document text with the query or
interest profile terms or by relevance scoring of individual paragraphs or
subparts of the document based on the model. The list of documents provided
by one or more search engines may also be re-ranked based on relevance ranking
and based on a representation of the user's need. The re-ranking may be based on
summaries provided by the search engines, or by actually retrieving the
documents and either pattern matching with the augmented terms or performing
a deeper linguistic and statistical analysis of the document text, or based on the
model and assessing the document relevance to the query.

Information, such as names of entities (e.g., the person's or a company
name) and the relations among the entities may be extracted using well known
heuristics and lexicon lookups, and provided as a list, linked back into the
document. For such names and relations, external links can also be found by
local lookup or query and provided to the user. Further, based on the
downloaded documents, thumbnails of the documents may be created with
highlighting corresponding to the most relevant portions of the documents.
Links to the document are provided within the thumbnail based on the
highlighting or discrete portions within the thumbnail corresponding to the
relevant portions of the document. The thumbnail provides a visual
representation of the relevance of the entire document and allows the user to
quickly identify an area of the document to help determine its relevance.

A summary of the document text can be provided by extracting salient
sentences from the text as identified by pattern matching with the augmented
terms or a deeper linguistic and statistical analysis of the document text, or based
on the model. Summaries can also be generated based on important entities and
entity relations found in the text, related to or independent from the current
user's interest or query context. In a browsing mode, the internal and external
links on a web page currently viewed can be assessed by downloading the text of
linked documents in the background and assessing their relevance to the user's
need and interest. Such information may be communicated to the user as an aid
in deciding whether or not to follow the links.

These different ways of providing relevance information can be divided 
into categories based on whether they require analysis of the target documents or 
not. Some can be effectively implemented based on a very shallow analysis of 
the document text, practically by pattern matching without having to access the
document in advance. These include highlighting, scrolling and thumbnail
creation and highlighting. Some ways are better implemented by downloading
the document text and providing a deeper linguistic analysis of the text. These
include more sophisticated document highlighting, scrolling and thumbnail
highlighting, entity extraction and entity relation finding, summarization of
documents, re-ranking of the retrieved documents and assessment of hyperlinks
in the documents.
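
A minimal sketch of the shallow category, assuming plain-text documents whose
paragraphs are separated by blank lines, is given below. The marker strings, the
function names and the density measure (matching words divided by total words)
are assumptions for illustration; only single-word terms are handled by the
density measure.

```python
import re


def highlight(text, terms, open_mark="[[", close_mark="]]"):
    """Wrap every whole-word occurrence of a term in marker strings."""
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: open_mark + m.group(0) + close_mark, text)


def term_density(paragraph, terms):
    """Fraction of words in the paragraph that are query/interest terms."""
    words = re.findall(r"\w+", paragraph.lower())
    if not words:
        return 0.0
    wanted = {t.lower() for t in terms}
    return sum(1 for w in words if w in wanted) / len(words)


def best_paragraph_index(document, terms):
    """Index of the densest paragraph, i.e. a candidate scroll target."""
    paragraphs = [p for p in document.split("\n\n") if p.strip()]
    if not paragraphs:
        return None
    densities = [term_density(p, terms) for p in paragraphs]
    return max(range(len(paragraphs)), key=densities.__getitem__)


doc = (
    "Intro text about nothing.\n\n"
    "Search engines rank search results.\n\n"
    "Closing remarks."
)
target = best_paragraph_index(doc, ["search", "rank"])
marked = highlight("Search engines rank results", ["search", "rank"])
```

In such a sketch the viewer would scroll to the paragraph at the returned index,
and the same densities could drive the shading of a thumbnail.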

The model of the user's interest may also vary across a broad spectrum
from simple to more detailed. The original user's description of the query may
be used in one embodiment. Further variations include using the augmented
query, an original description of the interest profile, an enhanced description of
the interest profile, general interest profiles which are not user specific, but are
selected from some topical hierarchy - a library of topic profiles, and
query/interest profile combined with information about the user's task.
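
One possible in-memory representation of such a model is sketched below. The
class name, the field names and the numeric weights are hypothetical and are
chosen only to show how the variations listed above could be combined into a
single set of weighted terms; the disclosure does not prescribe any particular
weighting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InterestModel:
    """Hypothetical container for the sources of a user interest model."""

    query_terms: List[str]                                     # original query
    augmented_terms: List[str] = field(default_factory=list)  # augmented query
    profile_terms: List[str] = field(default_factory=list)    # interest profile
    topic_terms: List[str] = field(default_factory=list)      # topic library
    task: Optional[str] = None                                 # user's task

    def weighted_terms(self) -> Dict[str, float]:
        """Merge all sources; a term keeps the highest weight it receives.

        The weights are illustrative assumptions: terms closer to the user's
        own query count for more than general topic terms.
        """
        weights: Dict[str, float] = {}
        sources = (
            (self.topic_terms, 0.25),
            (self.profile_terms, 0.5),
            (self.augmented_terms, 0.75),
            (self.query_terms, 1.0),
        )
        for terms, weight in sources:
            for term in terms:
                key = term.lower()
                weights[key] = max(weights.get(key, 0.0), weight)
        return weights


model = InterestModel(
    query_terms=["Jaguar"],
    augmented_terms=["jaguar", "car"],
    profile_terms=["car", "racing"],
)
weights = model.weighted_terms()
```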

In the present invention, document presentation and document analysis
features within a distributed computer network environment are provided where
document gathering, indexing and relevance assessment with respect to a user's
query is independent from document delivery and presentation to the user. The
user's need is separated from the search strategy. In other words, the user's query
and interest profile are modeled independently from search activities such as by
applying linguistic analysis. Further, support for relevance assessment is
provided in both the search and browsing modes. The user interest model is
applied to view and analyze documents that are accessed as a result of the search
activity or by browsing Web documents.
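
To illustrate this separation, the following sketch re-ranks results on the client
side using only the retrieved text and a set of weighted interest terms, without
any cooperation from the search engine. The function names, the (url, text)
result format and the scoring rule (average term weight per word) are assumptions
made for this example.

```python
import re


def relevance(text, weights):
    """Average weight per word of the interest terms found in the text."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


def rerank(results, weights):
    """Re-rank (url, text) pairs given in the search engine's original order.

    sorted() is stable, so documents with equal scores keep the order
    assigned by the search engine.
    """
    return sorted(results, key=lambda r: -relevance(r[1], weights))


engine_results = [
    ("a", "cats and dogs"),
    ("b", "jaguar car review"),
    ("c", "jaguar habitat"),
]
reranked = rerank(engine_results, {"jaguar": 1.0, "car": 0.75})
order = [url for url, _ in reranked]
```

Because the scoring function takes only text and weights, the same call could be
applied to pages reached by browsing rather than through a search.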

This application is intended to cover any adaptations or variations of the
present invention. It is manifestly intended that this invention be limited only by
the claims and equivalents thereof.

We claim:

1. A computer implemented method of displaying documents accessed in a
search or browsing mode, the method comprising: 

creating a model of a user's interest; 

accessing documents from a source of such documents; 

applying the model of the user's interest to the retrieved documents; and 

generating information regarding the relevancy of the retrieved 
documents. 

2. The method of claim 1 wherein the model comprises a query which is 
enhanced based on linguistic analysis. 

3. The method of claim 2 wherein the linguistic analysis comprises 
syntactic and semantic analysis. 

4. The method of claim 1, wherein the model comprises a query which is 
enhanced based on a general interest profile. 

5. The method of claim 4, wherein the general interest profile is applied 
equally to documents accessed by the user in both search and browsing modes. 

6. The method of claim 1, wherein the model of user interest is based at 
least partially on the user task. 

7. The method of claim 1, wherein the information is used to highlight
relevant portions of text in the retrieved documents. 

8. The method of claim 1, wherein the model comprises a query which is 
enhanced independently of and during the execution of the query by the search 
engine.

9. The method of claim 1, wherein the model comprises a query which is
applied to the accessed documents to assess relevance during retrieval of
documents from their sources. 

10. The method of claim 1, wherein documents are retrieved while a user that
generated a query may be performing other tasks. 

11. A computer readable medium having instructions stored thereon that
cause a computer to perform the method of claim 1.

12. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising:

sending a query to an independent search engine; 
receiving query results from the search engine; and 
generating information regarding the relevancy of the query results from 
the results independent of the search engine. 

13. The method of claim 12, wherein the information is used to highlight
relevant portions of text in the retrieved documents. 

14. The method of claim 12, wherein documents are retrieved while a user 
that generated the query may be performing other tasks.

15. A computer readable medium having instructions stored thereon that
cause a computer to perform the method of claim 12.

16. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to an independent search engine; 
creating a context based on a computer user's interests; 
receiving query results from the search engine; and
generating information regarding the relevancy of the query results from
the results independent of the search engine and based upon the context.

17. The method of claim 16, wherein each new search within the context 
results in information being generated for documents identified by such search 
based upon such context. 

18. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to a search engine separate from the computer; 
receiving query results from the search engine; 

enhancing the query; 

accessing documents identified by the query results; 

applying the enhanced query to the retrieved documents; and 
generating information regarding the relevancy of the retrieved documents based 
on the enhanced query. 

19. The method of claim 18 wherein the query is enhanced based on
linguistic analysis.

20. The method of claim 19 wherein the linguistic analysis comprises
syntactic and semantic analysis.

21. The method of claim 18, wherein the query is enhanced based on a
general interest profile.

22. The method of claim 21, wherein the general interest profile is applied
equally to documents accessed by the user in both search and browsing modes.

23. The method of claim 18, wherein the query is enhanced based on a model 
of user interest generated independent of search results.

24. The method of claim 18, wherein the information is used to highlight
relevant portions of text in the retrieved documents. 

25. The method of claim 18, wherein the query is enhanced during retrieval 
of documents from their sources. 

26. A computer readable medium having instructions stored thereon that 
cause a computer to perform the method of claim 18. 

27. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to a search engine separate from the computer; 
receiving ranked query results from the search engine; 
accessing documents identified by the query results; 
re-ranking the query results based on information contained in the 
retrieved documents. 

28. A computer readable medium having instructions stored thereon that 
cause a computer to perform the method of claim 27. 

29. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to a search engine separate from the computer; 
receiving ranked query results from the search engine; 
augmenting the query; and 

re-ranking the query results based on the augmented query. 

30. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to an independent search engine; 
receiving query results from the search engine;
retrieving a document; and

scrolling to a most relevant portion of the retrieved document. 

31. The method of claim 30, wherein the document is divided into sections,
and wherein a relevancy score is generated for each section.

32. The method of claim 31 wherein the most relevant portion is the section
with the highest score.

33. The method of claim 31 wherein one or more sections overlap other
sections.

34. The method of claim 31 wherein each section is a paragraph.

35. The method of claim 31 wherein each section is a sentence.

36. The method of claim 31 wherein each section comprises a predetermined
number of lines. 

37. A computer readable medium having instructions stored thereon that 
cause a computer to perform the method of claim 30. 

38. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to an independent search engine; 
receiving query results from the search engine; 
retrieving a document identified in the query results; and 
extracting names from the document and identifying associated links to 
such names. 

39. The method of claim 38 wherein the names comprise names of people or
companies.

40. The method of claim 38 wherein the links are internal to the document.

41. The method of claim 38 wherein the links are external to the document.

42. The method of claim 38 wherein the names are provided in a list next to 
the query results to help identify the relevance of documents. 

43. A computer readable medium having instructions stored thereon that
cause a computer to perform the method of claim 38. 

44. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising: 

sending a query to an independent search engine; 
receiving query results from the search engine; 
retrieving a document identified by such query results; and 
creating a thumbnail view of the document with portions of the view 
highlighted based on relevancy of corresponding portions of the document. 

45. The method of claim 44 wherein the highlighted portions correspond to 
links back to corresponding portions of text in the document. 

46. The method of claim 44 and further comprising enhancing the query. 

47. The method of claim 46 wherein the relevancy of the portions is 
determined based at least partially on the enhanced query. 

48. The method of claim 46 wherein the query is enhanced based on 
linguistic analysis. 

49. The method of claim 46, wherein the query is enhanced based on a
general interest profile.

50. The method of claim 46, wherein the query is enhanced during retrieval
of documents.

51. The method of claim 44, wherein documents are retrieved while a user
that generated the query may be performing other tasks.

52. A computer readable medium having instructions stored thereon that 
cause a computer to perform the method of claim 44. 

53. A computer implemented method of enhancing query results provided 
independent of a search engine, the method comprising:

sending a query to an independent search engine; 
receiving query results from the search engine; and 
retrieving a document identified by such query results; 
identifying relevant portions of the document; and 
generating a summary of the document comprising the most relevant 
portions identified. 

54. The method of claim 53, wherein the document is divided into sections, 
and wherein a relevancy score is generated for each section. 

55. The method of claim 54 wherein the most relevant portions are the 
sections with the highest score. 

56. The method of claim 54 wherein each section is a sentence. 

57. A computer readable medium having instructions stored thereon that 
cause a computer to perform the method of claim 53.

58. A computer system for enhancing query results provided independent of 
a search engine, the system comprising:

a module that sends a query to a search engine separate from the
computer; 

a module that receives query results from the search engine; 

a module that retrieves documents identified by the query results; 

a module that enhances the query; 

a module that applies the enhanced query to the retrieved documents; and 
a module that generates information regarding the relevancy of the 
retrieved documents. 

59. The system of claim 58 wherein the query is enhanced based on 
linguistic analysis. 

60. The system of claim 59 wherein the linguistic analysis comprises 
syntactic and semantic analysis. 

61. The system of claim 58, wherein the query is enhanced based on a
general interest profile.

62. The system of claim 61, wherein the general interest profile is applied
equally to documents accessed by the user in both search and browsing modes.

63. The system of claim 58, wherein the query is enhanced based on a model
of user interest generated independent of search results.

64. The system of claim 58, wherein the information is used to highlight
relevant portions of text in the retrieved documents. 

65. The system of claim 58, wherein the query is enhanced during retrieval 
of documents. 

66. The system of claim 58, wherein documents are retrieved while a user
that generated the query may be performing other tasks.

67. A computer system for enhancing query results provided independent of
a search engine, the system comprising: 

a module that sends a query to an independent search engine; 
a module that receives query results from the search engine; and 
a module that generates information regarding the relevancy of the query 
results from the results independent of the search engine. 

68. The system of claim 67, wherein the information is used to highlight 
relevant portions of text in the retrieved documents. 

69. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to an independent search engine; 

a module that creates a context based on a computer user's interests; 

a module that receives query results from the search engine; and 

a module that generates information regarding the relevancy of the query 
results from the results independent of the search engine and based upon the 
context.

70. The system of claim 69, wherein each new search within the context 
results in information being generated for documents identified by such search 
based upon such context. 

71. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to a search engine separate from the 
computer; 

a module that receives query results from the search engine; 

a module that retrieves documents identified by the query results; 

a module that enhances the query; 

a module that applies the enhanced query to the retrieved documents; and
a module that generates information regarding the relevancy of the
retrieved documents based on the enhanced query.

72. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to a search engine separate from the 
computer; 

a module that receives ranked query results from the search engine; 
a module that retrieves documents identified by the query results;
a module that re-ranks the query results based on information contained 
in the retrieved documents. 

73. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to a search engine separate from the 
computer; 

a module that receives ranked query results from the search engine; 
a module that augments the query; and 

a module that re-ranks the query results based on the augmented query. 

74. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to an independent search engine; 
a module that receives query results from the search engine; 
a module that retrieves a document; and 
a module that scrolls to a most relevant portion of the retrieved 
document. 

75. The system of claim 74, wherein the document is divided into sections, 
and wherein a relevancy score is generated for each section.

76. The system of claim 75 wherein the most relevant portion is the section
with the highest score. 

77. The system of claim 75 wherein one or more sections overlap other 
sections. 

78. The system of claim 75 wherein each section is a paragraph. 

79. The system of claim 75 wherein each section is a sentence. 

80. The system of claim 75 wherein each section comprises a predetermined 
number of lines. 

81. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to an independent search engine; 
a module that receives query results from the search engine;
a module that retrieves a document identified in the query results; and
a module that extracts names from the document and identifies
associated links to such names. 

82. The system of claim 81 wherein the names comprise names of people or 
companies. 

83. The system of claim 81 wherein the links are internal to the document.

84. The system of claim 81 wherein the links are external to the document. 

85. The system of claim 81 wherein the names are provided in a list next to
the query results to help identify the relevance of documents.

86. A computer system for enhancing query results provided independent of
a search engine, the system comprising: 

a module that sends a query to an independent search engine; 

a module that receives query results from the search engine; 

a module that retrieves a document identified by such query results; and 

a module that creates a thumbnail view of the document with portions of 
the view highlighted based on relevancy of corresponding portions of the
document. 

87. The system of claim 86 wherein the highlighted portions correspond to 
links back to corresponding portions of text in the document.

88. A computer system for enhancing query results provided independent of 
a search engine, the system comprising: 

a module that sends a query to an independent search engine; 
a module that receives query results from the search engine; and 
a module that retrieves a document identified by such query results; 
a module that identifies relevant portions of the document; and 
a module that generates a summary of the document comprising the most 
relevant portions identified. 

89. The system of claim 88, wherein the document is divided into sections, 
and wherein a relevancy score is generated for each section. 

90. The system of claim 89 wherein the most relevant portions are the 
sections with the highest score. 

91. The system of claim 89 wherein each section is a sentence. 

92. A computer implemented method of enhancing a query for an 
independent search engine, the method comprising: 

sending a query to an independent search engine; and
independently modeling the query.



93. The method of claim 92 wherein the independently modeled query is
applied to documents identified by the search engine.

94. The method of claim 92 wherein the independently modeled query 
comprises an enhanced representation selected from the group consisting of an 
original user description of the query, an augmented query, an original 
description of an interest profile, an enhanced description of the interest profile, 
general interest profiles, and a query/interest profile combined with information 
about the user's task. 



95. The method of claim 92 wherein the independently modeled query is
applied to documents accessed in a browse mode.

96. A method of assessing relevance of documents, the method comprising:
creating a user interest model;

analyzing documents accessed independent from a search engine; and
applying the user interest model to such documents.

97. The method of claim 96 and further comprising highlighting relevant
portions of the documents based on the application of the user interest model to
such documents.

98. The method of claim 96 wherein the user interest model is applied to 
documents accessed from independent search results, or in a browsing mode. 

99. The method of claim 96 and further comprising enhancing relevant 
portions of accessed documents for use by a user. 

100. The method of claim 99 wherein the enhancing of relevant documents is
selected from the group consisting of document highlighting, document
scrolling, document thumbnails, document re-ranking, hyperlink relevance
assessment, entity extraction, entity relation finding, and document
summarization.

101. The method of claim 99 wherein the enhancing of relevant documents is 
performed as documents are downloaded.

102. The method of claim 101 wherein the enhancing of relevant documents is
selected from the group consisting of highlighting all occurrences of
query/interest profile terms, scrolling to the first occurrence of important
concepts, scrolling to paragraphs with higher density of query/interest profile
terms, providing thumbnails highlighting all term occurrences, and providing
thumbnails highlighting paragraphs or portions of the text with various densities
of query/interest profile terms.

103. The method of claim 99 wherein the enhancing of relevant documents is
performed while documents are downloaded as a background task.

104. The method of claim 101 wherein the enhancing of relevant documents is
selected from the group consisting of selective highlighting of occurrences of
query/interest profile terms, scrolling to the most relevant passages in the
document, providing thumbnails highlighting of relevant passages, document
re-ranking/hyperlink relevance assessment based on relevance scoring, entity
extraction and entity relation finding, extraction of sentences containing context
terms, and generating summaries that contain information from both a context
and the document.

105. The method of claim 99 wherein the enhancements are based on a
shallow document text analysis or on a deep linguistic and statistical analysis of
the document text.

106. The method of claim 96 wherein the user interest model is selected from
the group consisting of an original user's description of a query/interest profile, 
an augmented query/enhanced description of the interest profile, general interest 
profiles, and query/interest profile combined with information about the user's 
task. 

107. A computer readable medium having instructions stored thereon that 
cause a computer to perform the method of claim 96. 

108. A method of assessing relevance of documents, the method comprising: 
creating a user interest model; 

analyzing documents accessed; and 

applying the user interest model to such documents to highlight relevant
portions of the documents based on the user interest model.

Abstract of the Disclosure



An information highlighting facility assists the user in evaluating
relevance of accessed documents to the user's information need. The accessed
documents may, for example, be identified by a search engine in response to a
user query. When accessing documents identified as relevant by a search engine
from other networked computers, the facility provides information highlighting
to assist the user in determining whether the document is relevant. A model of
the user's interest, which may include an augmented set of search terms, is used
to take into account the general interest of the user as captured by an interest
profile and context of use of the computer by the user, or a combination thereof.
The model of the user's interest is applied to the document text as the document
is accessed from its source. The highlighting of information about the document
content may include highlighting of the terminology in the text, scrolling of the
document to the relevant passages, identification of entity names and entity
relations, creation of a document summary and a document thumbnail, etc. In
addition, the model can be applied to a set of documents accessed by the user,
e.g., to re-rank the top scoring documents from the result set provided to the user
by a search engine or some other information providing services.

THROUGH SEARCH OR BROWSING.

The specification of which is attached hereto. 

I hereby state that I have reviewed and understand the contents of the above-identified specification, 
including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the patentability of this application in
accordance with 37 C.F.R. § 1.56 (attached hereto). I also acknowledge my duty to disclose all information known
to be material to patentability which became available between a filing date of a prior application and the national or
PCT international filing date in the event this is a Continuation-In-Part application in accordance with 37 C.F.R.
§ 1.63(e).

I hereby claim foreign priority benefits under 35 U.S.C. § 119(a)-(d) or 365(b) of any foreign application(s)
for patent or inventor's certificate, or 365(a) of any PCT international application which designated at least one
country other than the United States of America, listed below and have also identified below any foreign application
for patent or inventor's certificate having a filing date before that of the application on the basis of which priority is
claimed:

No such claim for priority is being made at this time.

I hereby claim the benefit under 35 U.S.C. § 119(e) of any United States provisional application(s) listed
below:

No such claim for priority is being made at this time.

I hereby claim the benefit under 35 U.S.C. § 120 or 365(c) of any United States and PCT international
application(s) listed below and, insofar as the subject matter of each of the claims of this application is not disclosed
in the prior United States or PCT international application in the manner provided by the first paragraph of 35 U.S.C.
§ 112, I acknowledge the duty to disclose material information as defined in 37 C.F.R. § 1.56(a) which became
available between the filing date of the prior application and the national or PCT international filing date of this
application:



No such claim for priority is being made at this time.

I hereby appoint the following attorney(s) and/or patent agent(s) to prosecute this application and to transact
all business in the Patent and Trademark Office connected herewith:

Anglin, J. Michael Reg. No. 24,916
Bianchi, Timothy E. Reg. No. 39,610
Billion, Richard E. Reg. No. 32,836
Black, David W. Reg. No. 42,331
Brennan, Leoniede M. Reg. No. 35,832
Brennan, Thomas F. Reg. No. 35,075
Brooks, Edward T., III Reg. No. 40,925
Chu, Dinh C.P. Reg. No. 41,676
Clark, Barbara J. Reg. No. 38,107
Crouse, Daniel D. Reg. No. 32,022
Dahl, John M. Reg. No. 44,639
Drake, Eduardo E. Reg. No. 40,594
Eliseeva, Maria M. Reg. No. 43,328
Embretson, Janet E. Reg. No. 39,665
Fordenbacher, Paul J. Reg. No. 42,546
Forrest, Bradley A. Reg. No. 30,837
Gamon, Owen J. Reg. No. 36,143
Harris, Robert J. Reg. No. 37,346

Huebsch, Joseph C. Reg. No. 42,673
Jurkovich, Patti J. Reg. No. 44,813
Kalis, Janal M. Reg. No. 37,650
Kaufmann, John D. Reg. No. 24,017
Klima-Silberg, Catherine L Reg. No. 40,052
Kluth, Daniel J. Reg. No. 32,146
Lacy, Rodney L. Reg. No. 41,136
Lemaire, Charles A. Reg. No. 36,198
LeMoine, Dana B. Reg. No. 40,062
Litman, Mark A. Reg. No. 26,390
Lundberg, Steven W. Reg. No. 30,568
Mack, Lisa K. Reg. No. 42,825
Maeyaert, Paul L. Reg. No. 40,076
Maki, Peter C. Reg. No. 42,832
Malen, Peter L. Reg. No. 44,894
Mates, Robert E. Reg. No. 35,271
McCrackin, Ann M. Reg. No. 42,858
Nama, Kash Reg. No. 44,255
Nelson, Albin J. Reg. No. 28,650
Nielsen, Walter W. Reg. No. 25,539
Oh, Allen J. Reg. No. 42,047
Padys, Danny J. Reg. No. 35,635
Parker, J. Kevin Reg. No. 33,024
Perdok, Monique M. Reg. No. 42,989
Prout, William F. Reg. No. 33,995
Sako, Katie E. Reg. No. 32,628
Schumm, Sherry W. Reg. No. 39,422
Schwegman, Micheal L. Reg. No. 25,816
Smith, Michael G. Reg. No. 45,368
Speier, Gary J. Reg. No. 45,458
Steffey, Charles E. Reg. No. 25,179
Terry, Kathleen R. Reg. No. 31,884
Tong, Viet V. Reg. No. 45,416
Viksnins, Ann S. Reg. No. 37,748
Woessner, Warren D. Reg. No. 30,440



I hereby authorize them to act and rely on instructions from and communicate directly with the person/assignee/attorney/
organization/who/which first sends/sent this case to them and by whom/which I hereby declare that I have consented after full
disclosure to be represented unless/until I instruct Schwegman, Lundberg, Woessner & Kluth, P.A. to the contrary.

Please direct all correspondence in this case to Schwegman, Lundberg, Woessner & Kluth, P.A. at the address indicated below:
P.O. Box 2938, Minneapolis, MN 55402

Telephone No. (612) 373-6900



I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and
belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false
statements may jeopardize the validity of the application or any patent issued thereon.

Full Name of joint inventor number 1: Natasa Milic-Frayling

Citizenship: United States of America Residence: , Great Britain

Post Office Address: 39 Highfield Avenue
Cambridge CB4 2AJ
Great Britain

Signature:

Natasa Milic-Frayling

Full Name of joint inventor number 2: Ralph Sommerer

Citizenship: Switzerland Residence: , Great Britain

Post Office Address: 62 Petersfield Mansions
Mill Road
Cambridge CB1 1BB
Great Britain

Signature:

Ralph Sommerer

§ 1.56 Duty to disclose information material to patentability.

(a) A patent by its very nature is affected with a public interest. The public interest is best served, and the most effective patent
examination occurs when, at the time an application is being examined, the Office is aware of and evaluates the teachings of all information
material to patentability. Each individual associated with the filing and prosecution of a patent application has a duty of candor and good
faith in dealing with the Office, which includes a duty to disclose to the Office all information known to that individual to be material to
patentability as defined in this section. The duty to disclose information exists with respect to each pending claim until the claim is canceled
or withdrawn from consideration, or the application becomes abandoned. Information material to the patentability of a claim that is
canceled or withdrawn from consideration need not be submitted if the information is not material to the patentability of any claim
remaining under consideration in the application. There is no duty to submit information which is not material to the patentability of any
existing claim. The duty to disclose all information known to be material to patentability is deemed to be satisfied if all information known
to be material to patentability of any claim issued in a patent was cited by the Office or submitted to the Office in the manner prescribed by
§§ 1.97(b)-(d) and 1.98. However, no patent will be granted on an application in connection with which fraud on the Office was practiced
or attempted or the duty of disclosure was violated through bad faith or intentional misconduct. The Office encourages applicants to
carefully examine:

(1) prior art cited in search reports of a foreign patent office in a counterpart application, and

(2) the closest information over which individuals associated with the filing or prosecution of a patent application believe any
pending claim patentably defines, to make sure that any material information contained therein is disclosed to the Office.

(b) Under this section, information is material to patentability when it is not cumulative to information already of record or being
made of record in the application, and

(1) It establishes, by itself or in combination with other information, a prima facie case of unpatentability of a claim; or

(2) It refutes, or is inconsistent with, a position the applicant takes in: 
(i) Opposing an argument of unpatentability relied on by the Office, or

(ii) Asserting an argument of patentability.

A prima facie case of unpatentability is established when the information compels a conclusion that a claim is unpatentable under the
preponderance of evidence, burden-of-proof standard, giving each term in the claim its broadest reasonable construction consistent with the
specification, and before any consideration is given to evidence which may be submitted in an attempt to establish a contrary conclusion of
patentability.

(c) Individuals associated with the filing or prosecution of a patent application within the meaning of this section are: 

(1) Each inventor named in the application;

(2) Each attorney or agent who prepares or prosecutes the application; and 

(3) Every other person who is substantively involved in the preparation or prosecution of the application and who is
associated with the inventor, with the assignee or with anyone to whom there is an obligation to assign the application.

(d) Individuals other than the attorney, agent or inventor may comply with this section by disclosing information to the attorney,
agent, or inventor. 



